Multidialectal Acoustic Modeling: a Comparative Study
Abstract
This paper addresses multidialectal acoustic modeling based on sharing data across dialects. It presents a comparative study of methods for combining data using decision tree clustering algorithms. The approaches developed differ in how they evaluate the similarity of sounds between dialects and in the decision tree structure they apply. The proposed systems are tested on Spanish dialects from Spain and Latin America. All of the proposed multidialectal systems improve on monodialectal performance by using data from another dialect, but the results show that the way the data is shared is critical. The best combination of similarity measure and tree structure achieves an improvement of 7% over the results obtained with monodialectal systems.
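The abstract does not say which similarity measure or tree structure the paper uses, so the following is only a minimal sketch of one common way decision tree clustering can decide whether to share data across dialects. It assumes a log-likelihood gain criterion with a single one-dimensional Gaussian per node, which is a simplification chosen for illustration. The dialect labels "A" and "B", the question names, and the toy feature values are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Stats:
    """Sufficient statistics for a 1-D Gaussian (assumed model, for illustration)."""
    n: float = 0.0   # number of frames
    s: float = 0.0   # sum of feature values
    ss: float = 0.0  # sum of squared feature values

    def add(self, x: float) -> None:
        self.n += 1.0
        self.s += x
        self.ss += x * x

    def merge(self, other: "Stats") -> "Stats":
        return Stats(self.n + other.n, self.s + other.s, self.ss + other.ss)

    def loglik(self, var_floor: float = 1e-3) -> float:
        """Log-likelihood of the pooled data under its own ML Gaussian."""
        if self.n == 0:
            return 0.0
        mean = self.s / self.n
        var = max(self.ss / self.n - mean * mean, var_floor)
        return -0.5 * self.n * (math.log(2.0 * math.pi * var) + 1.0)


def best_split(units, questions):
    """Return (question name, gain) for the split with the largest likelihood gain."""
    parent = Stats()
    for st in units.values():
        parent = parent.merge(st)
    best_name, best_gain = None, float("-inf")
    for name, predicate in questions.items():
        yes, no = Stats(), Stats()
        for key, st in units.items():
            if predicate(key):
                yes = yes.merge(st)
            else:
                no = no.merge(st)
        if yes.n == 0 or no.n == 0:
            continue  # the question does not separate anything
        gain = yes.loglik() + no.loglik() - parent.loglik()
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


# Hypothetical toy data: units are keyed by (dialect, left context).
raw = {
    ("A", "n"): [0.0, 0.2, -0.1],
    ("B", "n"): [0.1, -0.2, 0.0],
    ("A", "s"): [5.0, 5.1, 4.9],
    ("B", "s"): [5.2, 4.8, 5.0],
}
units = {}
for key, values in raw.items():
    st = Stats()
    for v in values:
        st.add(v)
    units[key] = st

# Hypothetical questions: one about the dialect, one about phonetic context.
questions = {
    "is_dialect_A": lambda key: key[0] == "A",
    "left_is_nasal": lambda key: key[1] == "n",
}

best_name, best_gain = best_split(units, questions)
print(best_name, round(best_gain, 2))
```

In this toy setup the two dialects behave alike within each context, so the context question gives a much larger gain than the dialect question, and the data from both dialects ends up pooled in the same leaves. When a sound really differs between dialects, the dialect question would win instead and keep the data separate, which is one way a tree can decide where sharing is safe.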
Similar resources
Multidialectal Spanish acoustic modeling for speech recognition
In recent years, language resources for speech recognition have been collected for many languages and, specifically, for global languages. One of the characteristics of global languages is their wide geographical dispersion and, consequently, their wide phonetic, lexical, and semantic dialectal variability. Even if the collected data is huge, it is difficult to represent dialectal variants...
Multidialectal Spanish Modeling for ASR
This paper describes the latest advances in our ongoing work in the area of Spanish multidialectal speech recognition. This work deals with the suitability of using a single multidialectal acoustic model for all the Spanish variants spoken in Europe and Latin America. The objective is twofold. First, it allows all the available databases to be used to jointly train and improve the same system. ...
Monodialectal and multidialectal infants' representation of familiar words.
Monolingual infants are typically studied as a homogenous group and compared to bilingual infants. This study looks further into two subgroups of monolingual infants, monodialectal and multidialectal, to identify the effects of dialect-related variation on the phonological representation of words. Using an Intermodal Preferential Looking task, the detection of mispronunciations in familiar word...
Data driven multidialectal phone set for Spanish dialects
This paper addresses the use of a data-driven approach to determine a multidialectal phone set for an automatic speech recognition system for Spanish dialects. This approach is based on a decision tree clustering algorithm that tries to cluster contextual units of different dialects. This procedure avoids the definition of a global phonetic inventory and the previous study of similarity of soun...
A High Order Approximation of the Two Dimensional Acoustic Wave Equation with Discontinuous Coefficients
This paper is concerned with the modeling and construction of a fifth order method for the two dimensional acoustic wave equation in heterogeneous media. The method is based on a standard discretization of the problem on smooth regions and a nonstandard method for nonsmooth regions. The construction of the nonstandard method is based on the special treatment of the interface using suitable jump conditio...